Sqoop Import in Files

Importing data in different file formats

sqoop import --options-file /home/cloudera/Desktop/data/pram.txt -m 1 --table emp --as-textfile --target-dir=/user/cloudera/sqoop;
sqoop import --options-file /home/cloudera/Desktop/data/pram.txt -m 1 --table emp --as-sequencefile --target-dir=/sqoop/IMPORT_SEQFILE;
 
AVRO FILE
Avro stores both the data definition (the schema) and the data together in one message or file, so programs can interpret the contents of an Avro file or message without an external schema. The schema is stored in JSON format, which makes it easy to read and interpret, while the data itself is stored in binary format, which makes it compact and efficient. Avro files also include sync markers that allow large data sets to be split into subsets suitable for MapReduce processing.
sqoop import --options-file /home/cloudera/Desktop/data/pram.txt -m 1 --table emp --as-avrodatafile --target-dir=/sqoop/IMPORT_AVROFILE;

sqoop import --options-file /home/cloudera/Desktop/data/pram.txt -m 1 --table emp --as-avrodatafile --avroschemafile=/sqoop/IMPORT_AVROFILE --target-dir=/sqoop/IMPORT_AVROFILE;

hadoop fs -cat /sqoop/IMPORT_AVROFILE/part-m-00000.avro
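
Because the Avro part file is binary, the hadoop fs -cat output above is mostly unreadable. As a quick sanity check, the sketch below tests whether a stream starts with the four-byte Avro container header (the ASCII characters Obj followed by the byte 0x01). The function name is_avro_stream is an assumption made for this example; it is not part of Hadoop or Sqoop.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Sqoop): succeeds when standard
# input begins with the Avro object container magic bytes 'O' 'b' 'j' 0x01.
is_avro_stream() {
    # Read only the first four bytes and render them as lowercase hex.
    magic=$(head -c 4 | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "4f626a01" ]
}
```

One possible use, assuming the import above has completed: hadoop fs -cat /sqoop/IMPORT_AVROFILE/part-m-00000.avro | is_avro_stream && echo "Avro container file"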


Extract the AVSC schema from an Avro file
AVSC: an .avsc file contains the schema of the Avro file. We can create a new table using the .avsc schema file.
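
To give a feel for what an .avsc file contains, here is a minimal sketch that writes an illustrative schema to a scratch file. The columns id and name, the file name emp_example.avsc, and the nullable types are all assumptions for this example; the schema extracted from your own emp table will list its actual columns.

```shell
#!/bin/sh
# Illustrative only: the fields below are assumed, not taken from the real emp table.
example_avsc="${TMPDIR:-/tmp}/emp_example.avsc"

cat > "$example_avsc" <<'EOF'
{
  "type": "record",
  "name": "emp",
  "fields": [
    {"name": "id",   "type": ["null", "int"],    "default": null},
    {"name": "name", "type": ["null", "string"], "default": null}
  ]
}
EOF

# Show the schema that was written.
cat "$example_avsc"
```

Each entry in fields corresponds to one column, and the union with null marks that column as nullable.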
hadoop fs -get /sqoop/IMPORT_AVROFILE/part-m-00000.avro /home/cloudera/Desktop/data/emp.avro
java -jar /usr/lib/avro/avro-tools-1.7.6-cdh5.13.0.jar getschema /home/cloudera/Desktop/data/emp.avro > /home/cloudera/Desktop/data/emp.avsc
ls -l /home/cloudera/Desktop/data/emp.avsc
hdfs dfs -put /home/cloudera/Desktop/data/emp.avsc /user/hive/warehouse/
create external table emp_avro stored as avro location '/sqoop/IMPORT_AVROFILE/'
tblproperties('avro.schema.url'='/user/hive/warehouse/emp.avsc');
create external table emp_avro2 stored as avro
tblproperties('avro.schema.url'='/user/hive/warehouse/emp.avsc');
sqoop import --connect jdbc:mysql://localhost/dwdev --username root --password cloudera -m 1 --table emp --as-avrodatafile --target-dir=/sqoop/IMPORT_AVROFILE1 --outdir java_code;
